An MML classification of protein structure that knows about angles and sequence.

Authors

  • T Edgoose
  • L Allison
  • D L Dowe
Abstract

The MML classification program, Snob, deals with mixture modelling (or clustering) of circular data. It has recently been extended to do Markov modelling of the serial correlation between clusters, for example the fact that a Helix cluster favours being followed by another Helix cluster. Such a model is better known as a Hidden Markov Model. The search for the most appropriate secondary structure classification of protein data is of significant importance and was addressed by Hunter and States (1992), who applied the Bayesian classifier AutoClass to Cartesian co-ordinate data of protein residues. Dowe et al. (1996) improved upon this earlier work by using Snob to cluster dihedral angle data, with the advantage that 3 x 3 = 9 Cartesian co-ordinates can be represented by the 2 orientation-invariant angles, phi and psi. The Hidden Markov Model used here is shown to be a still more appropriate way of modelling protein data, and it results in the selection of a simpler class model with 17 structure classes. We report on this classification, including the class transition matrix, and relate it back to the amino-acid sequence and the simple Helix, Beta, Turn classification. We find 3 types of Helix, 2 types of Beta and many types of Turn. The most numerous Turn class defines a continuous flexible structure that is negatively correlated to all the other classes.
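To make the idea of a class transition matrix concrete, the following is a minimal sketch that counts first-order transitions in a sequence of per-residue class labels and normalises each row. It is not the MML-based inference performed by Snob; the function name `transition_matrix`, the three-letter label alphabet and the example label sequence are all assumptions made for illustration.

```python
from collections import Counter


def transition_matrix(labels, classes):
    """Row-normalised first-order transition frequencies between classes.

    Illustrative only: plain frequency counting, not an MML estimate.
    """
    # Count each adjacent (current, next) pair along the sequence.
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for a in classes:
        row_total = sum(counts[(a, b)] for b in classes)
        # A class that never has a successor gets an all-zero row.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in classes
        }
    return matrix


# Hypothetical per-residue labels: H = Helix, B = Beta, T = Turn.
labels = list("HHHHTTBBBTHH")
classes = ["H", "B", "T"]
P = transition_matrix(labels, classes)

for a in classes:
    print(a, [round(P[a][b], 2) for b in classes])
```

In this toy sequence the Helix row is dominated by its diagonal entry, which is the kind of serial correlation ("Helix favours being followed by Helix") that the abstract describes.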

Similar resources

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

GENERATING FUZZY RULES FOR PROTEIN CLASSIFICATION

This paper considers the generation of some interpretable fuzzy rules for assigning an amino acid sequence into the appropriate protein superfamily. Since the main objective of this classifier is the interpretability of rules, we have used the distribution of amino acids in the sequences of proteins as features. These features are the occurrence probabilities of six exchange groups in the seque...

In Silico Analysis of Primary Sequence and Tertiary Structure of Lepidium Draba Peroxidase

Peroxidase enzymes are vastly applicable in industry and diagnosis. Recently, we introduced a new kind of peroxidase gene from Lepidium draba (LDP). According to protein multiple sequence alignment results, LDP had 93% similarity and 88.96% identity with horseradish peroxidase C1A (HRP C1A). In the current study we employed in silico tools to determine, to which group of peroxidase enzymes LDP...

Structural Characteristics of Stable Folding Intermediates of Yeast Iso-1-Cytochrome-c

Cytochrome-c (cyt-c) is an electron transport protein, and it is present throughout the evolution. More than 280 sequences have been reported in the protein sequence database (www.uniprot.org). Though sequentially diverse, cyt-c has essentially retained its tertiary structure or fold. Thus a vast data set of varied sequences with retention of similar structure and fun...

A Dihedral Angle Database of Short Sub-sequences for Protein Structure Prediction

Protein structure prediction is considered to be the holy grail of bioinformatics. Ab initio and homology modelling are two important groups of methods used in protein structure prediction. Amongst these, ab initio methods assume that no previous knowledge about protein structures is required. On the other hand homology modelling is based on sequence similarity and uses information such as clas...

Journal title:
  • Pacific Symposium on Biocomputing

Publication date: 1998